The DC-Tree: A Fully Dynamic Index Structure for Data Warehouses

Authors

  • Martin Ester
  • Jörn Kohlhammer
  • Hans-Peter Kriegel
Abstract

Many companies have recognized the strategic importance of the knowledge hidden in their large databases and have built data warehouses. Typically, updates are collected and applied to the data warehouse periodically in batch mode, e.g., overnight. All derived information, such as index structures, then has to be updated as well. This standard approach of bulk incremental updates to data warehouses has two drawbacks. First, the average runtime for a single update is small, but the total runtime for the whole batch of updates may become rather large. Second, the contents of the data warehouse are not always up to date. In this paper, we introduce the DC-tree, a fully dynamic index structure for data warehouses modeled as a data cube. This new index structure is designed for applications where the above drawbacks of the bulk update approach are critical. The DC-tree is a hierarchical index structure, similar to the X-tree, that exploits the concept hierarchies typically defined for the dimensions of a data cube. Instead of minimum bounding rectangles and an artificial total ordering, the DC-tree uses minimum describing sets and the partial ordering of the attribute values induced by the concept hierarchies. Furthermore, for each minimum describing set in the directory, the values of the measure attributes are materialized. We conducted an extensive experimental performance evaluation using the TPC-D benchmark data. Our results demonstrate that the DC-tree yields a significant speed-up compared to the X-tree and to sequential search when processing general range queries on a data cube.
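The abstract does not give the formal definition of a minimum describing set or the node layout, so the following is only a rough sketch of the general idea, under stated assumptions: a single dimension, a hypothetical `PARENT` concept hierarchy, a hypothetical `Node` type, and a sum as the only materialized measure. It illustrates how a partial ordering from a concept hierarchy, together with aggregates stored in directory entries, could let a range query stop early instead of descending to the data records.

```python
from dataclasses import dataclass, field

# Hypothetical concept hierarchy for one dimension: value -> parent value.
PARENT = {
    "Munich": "Germany", "Berlin": "Germany", "Paris": "France",
    "Germany": "Europe", "France": "Europe", "Europe": "ALL",
}

def ancestors(value):
    """Return the value followed by all of its ancestors in the hierarchy."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def covers(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    return general in ancestors(specific)

@dataclass
class Node:
    """Assumed directory entry: describing values plus a materialized sum."""
    describing_set: set
    measure_sum: float = 0.0
    children: list = field(default_factory=list)  # child nodes (directory)
    records: list = field(default_factory=list)   # (value, measure) in leaves

def set_covers(query, described):
    """Every described value lies under some query value."""
    return all(any(covers(q, v) for q in query) for v in described)

def set_overlaps(query, described):
    """Some query value and some described value are comparable."""
    return any(covers(q, v) or covers(v, q) for q in query for v in described)

def range_sum(node, query):
    if set_covers(query, node.describing_set):
        return node.measure_sum          # answer from the materialized value
    if not set_overlaps(query, node.describing_set):
        return 0.0                       # prune this subtree
    if node.children:
        return sum(range_sum(c, query) for c in node.children)
    # Leaf that is only partially covered: check individual records.
    return sum(m for v, m in node.records if any(covers(q, v) for q in query))

# Tiny example tree (illustrative data, not from the paper).
german = Node({"Germany"}, 15.0, records=[("Munich", 10.0), ("Berlin", 5.0)])
french = Node({"France"}, 7.0, records=[("Paris", 7.0)])
root = Node({"Europe"}, 22.0, children=[german, french])
```

In this sketch, a query for `{"Europe"}` is answered at the root from the stored sum, a query for `{"Germany"}` prunes the French subtree, and a query for `{"Munich"}` is the only one that has to inspect individual records.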


Similar resources

Dynamic cubing for hierarchical multidimensional data space

Data warehouses have been used in many applications for quite a long time. Traditionally, new data in these warehouses is loaded through offline bulk updates, which implies that the latest data is not always available for analysis. This, however, is not acceptable in many modern applications (such as intelligent buildings, smart grids, etc.) that require the latest data for decision making. These mod...


A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries

Data streams are infinite, fast sequences of time-stamped data elements that are received explosively. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries on them are mostly one-pass. The execution of such algorithms faces challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...


Malmquist Productivity Index with Dynamic Network Structure

Data envelopment analysis (DEA) measures the relative efficiency of decision making units (DMUs) with multiple inputs and multiple outputs. The DEA-based Malmquist productivity index measures the productivity change over time. We propose a dynamic DEA model involving network structure in each period within the framework of DEA. We have previously published the network DEA (NDEA) and the dynamic DEA ...


The Dimension-Join: A New Index for Data Warehouses

There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...


A local measurement-based protection scheme for DER integrated DC microgrid using Bagging Tree

In recent years, DC microgrid has attracted considerable attention of the research community because of the wide usage of DC power-based appliances. However, the acceptance of DC microgrid by power utilities is still limited due to the issues associated with the development of a reliable protection scheme. The high magnitude of DC fault current, its rapid rate of rising and absence of zero cros...




Publication date: 2000